An approach for efficient open vocabulary spoken term detection
Abstract
This paper presents a hybrid two-pass approach for fast and efficient open vocabulary spoken term detection (STD). A large vocabulary continuous speech recognition (LVCSR) system produces word lattices from audio recordings. An index construction technique enables very fast search of the lattices for occurrences of both in-vocabulary (IV) and out-of-vocabulary (OOV) query terms. Search for query terms is performed in two passes. In the first pass, a subword approach identifies, from the index, audio segments that are likely to contain occurrences of the IV and OOV query terms. In the second pass, a more detailed subword-based search verifies the occurrence of the query terms in the candidate segments. The performance of this STD system is evaluated on an open vocabulary STD task defined on a lecture domain corpus. The indexing method is shown to produce an index that is nearly two orders of magnitude smaller than the LVCSR lattices while preserving most of the information relevant for STD. Furthermore, despite using word lattices for constructing the index, 67% of the segments containing occurrences of the OOV query terms are identified from the index in the first pass. Finally, the subword-based term detection performed in the second pass is shown to reduce the performance gap between OOV and IV query terms. © 2013 Elsevier B.V. All rights reserved.
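The two-pass idea in the abstract can be illustrated with a toy sketch. This is not the paper's method: the paper builds its index from LVCSR word lattices, whereas this sketch assumes each audio segment is already available as a single phone sequence. The phone bigram index, the edit-distance verification, the thresholds, and all names (`build_index`, `first_pass`, `best_match_cost`, `detect`) are hypothetical choices made only for illustration.

```python
from collections import defaultdict

def ngrams(phones, n=2):
    """Return the set of phone n-grams in a phone sequence."""
    return {tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)}

def build_index(segments, n=2):
    """Map each phone n-gram to the set of segment ids that contain it."""
    index = defaultdict(set)
    for seg_id, phones in segments.items():
        for gram in ngrams(phones, n):
            index[gram].add(seg_id)
    return index

def first_pass(index, query, n=2, min_overlap=0.5):
    """Pass 1: keep segments sharing at least min_overlap of the query n-grams."""
    grams = ngrams(query, n)
    hits = defaultdict(int)
    for gram in grams:
        for seg_id in index.get(gram, ()):
            hits[seg_id] += 1
    return {s for s, count in hits.items() if count >= min_overlap * len(grams)}

def best_match_cost(query, phones):
    """Minimum edit distance between the query and any substring of phones."""
    prev = [0] * (len(phones) + 1)  # a match may start anywhere in the segment
    for i, q in enumerate(query, start=1):
        cur = [i]
        for j, p in enumerate(phones, start=1):
            cur.append(min(prev[j] + 1,              # query phone left unmatched
                           cur[j - 1] + 1,           # extra phone in the segment
                           prev[j - 1] + (q != p)))  # match or substitution
        prev = cur
    return min(prev)  # a match may end anywhere in the segment

def detect(segments, index, query, n=2, max_cost_ratio=0.25):
    """Pass 2: verify each candidate segment with approximate phone matching."""
    candidates = first_pass(index, query, n)
    limit = max_cost_ratio * len(query)
    return sorted(s for s in candidates
                  if best_match_cost(query, segments[s]) <= limit)

# Toy data (assumed phone transcriptions, one per segment).
segments = {
    "seg1": "dh ax l ae t ax s ih z".split(),
    "seg2": "s p iy ch r eh k".split(),
    "seg3": "l ae t er z".split(),
}
index = build_index(segments)
query = "l ae t ax s".split()  # assumed pronunciation of the query term

print(first_pass(index, query))        # candidate segments from the index
print(detect(segments, index, query))  # segments that survive verification
```

In this toy data, the cheap index lookup keeps both `seg1` and `seg3` as candidates, and the more detailed second pass rejects `seg3` because no part of it matches the query closely enough.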
Similar resources
Fast subword-based approach for open vocabulary spoken term detection
This paper describes an efficient two-stage approach using sub-phonetic segment N-gram index and shift continuous dynamic programming for open vocabulary spoken term detection. With this two-stage search, we attempt to improve performance in both retrieval accuracy and process time. In the speech recognition process, a more sophisticated subword that is shorter than phonemes is used to minimize...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieving from this archive, is one of the key issues for this broadcasting...
Hybrid word-subword spoken term detection
The thesis investigates keyword spotting and spoken term detection (STD), which are considered subsets of spoken document retrieval. It deals with two-phase approaches where speech is first processed by a speech recognizer, and the search for queries is performed in the output of this recognizer. A standard large vocabulary continuous speech recognizer (LVCSR) with fixed vocabulary is not c...
Fast decoding for open vocabulary spoken term detection
Information retrieval and spoken-term detection from audio such as broadcast news, telephone conversations, conference calls, and meetings are of great interest to the academic, government, and business communities. Motivated by the requirement for high-quality indexes, this study explores the effect of using both word and sub-word information to find in-vocabulary and OOV query terms. It also ...
Contextual verification for open vocabulary spoken term detection
In spoken term detection, subword speech recognition is a viable means for addressing the out-of-vocabulary (OOV) problem at query time. Applying fuzzy error compensation techniques is needed for coping with inevitable recognition errors, but can lead to high false alarm rates especially for short queries. We propose two novel methods which reject false alarms based on the context of the hypoth...
Journal title:
- Speech Communication
Volume 57
Publication date 2014